<html>
<head><meta charset="utf-8"><title>Canonical names/structure for syntax trees · t-compiler · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/index.html">t-compiler</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html">Canonical names/structure for syntax trees</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="205399435"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205399435" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> matklad <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205399435">(Jul 29 2020 at 19:01)</a>:</h4>
<p>cc <span class="user-mention" data-user-id="120518">@Eric Huss</span> , <span class="user-mention" data-user-id="123856">@Vadim Petrochenkov</span> , <span class="user-mention" data-user-id="119009">@eddyb</span> </p>
<p>I am looking into "finalizing" rust-analyzer syntax tree structure in terms of nonterminal &amp; field names. The current shape is specified <a href="https://github.com/rust-analyzer/rust-analyzer/blob/master/xtask/src/codegen/rust.ungram">in this grammar file</a>.</p>
<p>Do we already have a "canonical" set of names?  Two plausible candidates are:</p>
<ul>
<li><a href="https://github.com/rust-lang/wg-grammar/tree/master/grammar">https://github.com/rust-lang/wg-grammar/tree/master/grammar</a></li>
<li><a href="https://doc.rust-lang.org/reference/">https://doc.rust-lang.org/reference/</a></li>
</ul>
<p>But they don't agree with each other :-)</p>



<a name="205399571"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205399571" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> matklad <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205399571">(Jul 29 2020 at 19:02)</a>:</h4>
<p>My broader goal is to produce a sort-of "standard" naming, reusable across tools (rust-analyzer, rutc, IntelliJ Rust, ideally the reference), so I want to get this right.</p>



<a name="205399678"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205399678" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> matklad <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205399678">(Jul 29 2020 at 19:03)</a>:</h4>
<p>Is this a good idea to share ast node names between the reference and the implementation? </p>
<p>Currently, the reference has productions named like <code>FunctionReturnType</code> and, while it's the best name for the reference, for implementation I'd rather pick something shorter.</p>



<a name="205402232"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205402232" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eric Huss <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205402232">(Jul 29 2020 at 19:26)</a>:</h4>
<p>I do not think there is anything "canonical".  If you ask 5 people to come up with some names, you will probably get 5 different answers. <span aria-label="grinning" class="emoji emoji-1f600" role="img" title="grinning">:grinning:</span>  You can also toss <code>rustc</code>'s own naming convention as well as <code>syn</code> into the ring and let them all duke it out.   </p>
<p>As for the reference, I am unsure.  Most language specs use relatively verbose production names ("function-declaration", "FunctionDecl", "external-parameter-name", etc.), and I don't think brevity is terribly important there, whereas clarity is.  Also, it has not been decided what the reference will eventually contain.  Again, most language specs use a simple grammar, but rust's grammar has become increasingly complex and difficult to express in a simple form.</p>



<a name="205403210"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205403210" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> matklad <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205403210">(Jul 29 2020 at 19:34)</a>:</h4>
<p>Uhu. And was is the current state of the reference, grammar wise? Am I correct that now it is already „quite good“? I think I’ll try to use the reference &amp; syn as two primary targets to normalize rust-analyzer‘s „grammar“ against.</p>
<p>Longer term, i am also wondering if we will be able to generate <em>current</em> rustc‘s AST from the same grammar...</p>



<a name="205403580"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205403580" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eric Huss <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205403580">(Jul 29 2020 at 19:38)</a>:</h4>
<p>I would not characterize the reference as "good".  The reference has fallen behind quite a bit, and there are a number of bugs.  I'm reluctant to say the wg-grammar grammar is up-to-date either, because it doesn't include any disambiguation or tests.  <code>rustc</code> is really the only clear source right now.  I also think <code>syn</code> is probably pretty good, but I have not looked at it closely recently.</p>



<a name="205403836"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205403836" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eric Huss <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205403836">(Jul 29 2020 at 19:40)</a>:</h4>
<p>Even rustc has its bugs. The one that came up recently is <a href="https://github.com/rust-lang/rust/issues/74854">https://github.com/rust-lang/rust/issues/74854</a>.  It has never been clear to me why <code>{{1}+1}</code> doesn't parse.  It doesn't seem like it is ambiguous to me, but it is not allowed.  I guess, do you want to replicate all of rustc's bugs as well? <span aria-label="smile" class="emoji emoji-1f642" role="img" title="smile">:smile:</span></p>



<a name="205403884"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205403884" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> matklad <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205403884">(Jul 29 2020 at 19:41)</a>:</h4>
<p>Uhu. Luckily, I don‘t need to care about ambiguities. What I am looking at is <em>just</em> the structure of the syntax trees (which nodes can have which children), and that is actually much easier to specify, as you don‘t need to care about „no struct literals“.</p>



<a name="205404035"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205404035" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> matklad <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205404035">(Jul 29 2020 at 19:42)</a>:</h4>
<p>that‘s why the grammar thing I am using is called „ungrammar“ :) We‘ll need to spec parser as well some day, of course, but I want to postpone this for now.</p>



<a name="205404273"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Canonical%20names/structure%20for%20syntax%20trees/near/205404273" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eric Huss <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Canonical.20names.2Fstructure.20for.20syntax.20trees.html#205404273">(Jul 29 2020 at 19:44)</a>:</h4>
<p>If you do look at the wg-grammar, it would probably be best to start with this PR: <a href="https://github.com/rust-lang/wg-grammar/pull/68">https://github.com/rust-lang/wg-grammar/pull/68</a>  It's a big update, but it was never merged.</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>